Fantasy sports allow fans to manage a team of their favorite athletes and compete with friends. Fantasy platforms align the real-world statistical performance of athletes with fantasy scoring and have steadily risen in popularity, with an estimated 9.1 million players per month and some 4.4 billion player views on the ESPN Fantasy Football platform over 2018-2019. In parallel, the sports media community produces news stories, blogs, forum posts, tweets, videos, and podcasts about athletes both inside and outside of fantasy sports. However, human fantasy football players can only analyze an average of 3.9 sources of information. Our work discusses the results of a machine learning pipeline to manage an ESPN Fantasy Football team. Trained statistical entity detectors and document2vector models, applied each day to over 100,000 news sources and 2.3 million articles, videos, and podcasts, enable the system to comprehend natural language with a category test accuracy of 100% and a keyword test accuracy of 80%. Deep-learning feedforward neural networks provide player classifications, such as whether a player will be a bust, be a boom, play with a hidden injury, or play meaningful touches, with a cumulative accuracy of 72%. Finally, a multiple-regression ensemble uses the deep learning output together with ESPN projection data to provide a point projection for each of the top 500+ fantasy football players for 2018. The point projections maintained an RMSE of 6.78 points. The best-fit probability density function from a set of 24 is selected to visualize score spreads. Within the first 6 weeks of the product launch, users collectively spent 46 years viewing our AI insights. The training data for our models was provided by a 2015-2016 web archive from Webhose, ESPN statistics, and Rotowire injury reports. We used 2017 fantasy football data as a test set.
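The final stage above combines deep-learning outputs with projection data through a regression ensemble scored by RMSE. A minimal sketch of that idea, on synthetic data with illustrative feature names (none of this is ESPN's actual data or model):

```python
import numpy as np

# Toy sketch: a linear regression that combines a hypothetical deep-learning
# "boom" probability with a baseline point projection to produce a fantasy
# point projection, evaluated by RMSE. All numbers are synthetic.
rng = np.random.default_rng(0)
n = 500
boom_prob = rng.uniform(0, 1, n)          # hypothetical DL classifier output
base_projection = rng.uniform(2, 25, n)   # hypothetical baseline projection
X = np.column_stack([np.ones(n), boom_prob, base_projection])
true_points = 1.5 + 4.0 * boom_prob + 0.9 * base_projection + rng.normal(0, 2, n)

# Ordinary least squares fit of the ensemble weights, then RMSE.
coef, *_ = np.linalg.lstsq(X, true_points, rcond=None)
pred = X @ coef
rmse = float(np.sqrt(np.mean((pred - true_points) ** 2)))
```

Because the synthetic noise has a standard deviation of 2 points, the fitted ensemble's RMSE lands near 2; the paper's 6.78-point RMSE reflects real, much noisier data.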
Vision transformers (ViTs) are quickly becoming the de-facto architecture for computer vision, yet we understand very little about why they work and what they learn. While existing studies visually analyze the mechanisms of convolutional neural networks, an analogous exploration of ViTs remains challenging. In this paper, we first address the obstacles to performing visualizations on ViTs. Assisted by these solutions, we observe that neurons in ViTs trained with language model supervision (e.g., CLIP) are activated by semantic concepts rather than visual features. We also explore the underlying differences between ViTs and CNNs, and we find that transformers detect image background features, just like their convolutional counterparts, but their predictions depend far less on high-frequency information. On the other hand, both architecture types behave similarly in the way features progress from abstract patterns in early layers to concrete objects in late layers. In addition, we show that ViTs maintain spatial information in all layers except the final layer. In contrast to previous works, we show that the last layer most likely discards the spatial information and behaves as a learned global pooling operation. Finally, we conduct large-scale visualizations on a wide range of ViT variants, including DeiT, CoaT, ConViT, PiT, Swin, and Twin, to validate the effectiveness of our method.
Cutting-edge diffusion models produce images with high quality and customizability, enabling them to be used for commercial art and graphic design purposes. But do diffusion models create unique works of art, or are they stealing content directly from their training sets? In this work, we study image retrieval frameworks that enable us to compare generated images with training samples and detect when content has been replicated. Applying our frameworks to diffusion models trained on multiple datasets including Oxford flowers, Celeb-A, ImageNet, and LAION, we discuss how factors such as training set size impact rates of content replication. We also identify cases where diffusion models, including the popular Stable Diffusion model, blatantly copy from their training data.
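The retrieval idea above reduces to a nearest-neighbor search in a feature space: a generated image is flagged when its best match in the training set is suspiciously similar. A minimal sketch with random vectors standing in for a real feature extractor (the threshold is illustrative):

```python
import numpy as np

# Flag possible content replication: embed images as feature vectors, then
# check the top cosine similarity of a generated image against the training
# bank. Random embeddings stand in for a trained feature extractor.
def top_match(query, bank):
    """Return (best_index, best_cosine_similarity) of query against bank."""
    q = query / np.linalg.norm(query)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sims = b @ q
    i = int(np.argmax(sims))
    return i, float(sims[i])

rng = np.random.default_rng(1)
train_feats = rng.normal(size=(1000, 128))
copied = train_feats[42] + 0.01 * rng.normal(size=128)  # near-duplicate sample
novel = rng.normal(size=128)                            # unrelated sample

idx, sim = top_match(copied, train_feats)
_, sim_novel = top_match(novel, train_feats)
is_replication = sim > 0.95 > sim_novel  # 0.95 is an illustrative threshold
```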
One of the most successful paradigms for reward learning uses human feedback in the form of comparisons. Although these methods hold promise, human comparison labeling is expensive and time consuming, constituting a major bottleneck to their broader applicability. Our insight is that we can greatly improve how effectively human time is used in these approaches by batching comparisons together, rather than having the human label each comparison individually. To do so, we leverage data dimensionality-reduction and visualization techniques to provide the human with an interactive GUI displaying the state space, in which the user can label subportions of the state space. Across some simple Mujoco tasks, we show that this high-level approach holds promise and is able to greatly increase the performance of the resulting agents, given the same amount of human labeling time.
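The batching step above amounts to projecting high-dimensional states to a viewable space and labeling whole regions at once. A toy sketch, with PCA via SVD standing in for the paper's visualization and a fixed rectangle standing in for one GUI selection (both are assumptions, not the paper's exact pipeline):

```python
import numpy as np

# Project toy high-dimensional "states" to 2D, then apply a single human
# label to every state inside a selected rectangle, instead of labeling
# comparisons one at a time. Region bounds are illustrative.
rng = np.random.default_rng(2)
states = rng.normal(size=(300, 10))  # stand-in for Mujoco states

# PCA: center, then project onto the top-2 right singular vectors.
centered = states - states.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords = centered @ vt[:2].T

# One GUI interaction: label every state inside the rectangle at once.
xmin, xmax, ymin, ymax = -1.0, 1.0, -1.0, 1.0
in_region = (
    (coords[:, 0] > xmin) & (coords[:, 0] < xmax)
    & (coords[:, 1] > ymin) & (coords[:, 1] < ymax)
)
labels = np.where(in_region, 1.0, np.nan)  # 1.0 = "good region", NaN = unlabeled
```

One rectangle selection labels many states in one human action, which is the source of the claimed efficiency gain.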
Deep neural networks are susceptible to shortcut learning, using simple features to achieve low training loss without discovering essential semantic structure. Contrary to prior belief, we show that generative models alone are not sufficient to prevent shortcut learning, despite an incentive to recover a more comprehensive representation of the data than discriminative approaches. However, we observe that shortcuts are preferentially encoded with minimal information, a fact that generative models can exploit to mitigate shortcut learning. In particular, we propose Chroma-VAE, a two-pronged approach where a VAE classifier is initially trained to isolate the shortcut in a small latent subspace, allowing a secondary classifier to be trained on the complementary, shortcut-free latent subspace. In addition to demonstrating the efficacy of Chroma-VAE on benchmark and real-world shortcut learning tasks, our work highlights the potential for manipulating the latent space of generative classifiers to isolate or interpret specific correlations.
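The core mechanism above is a latent-subspace split: confine the shortcut to a few dimensions, then train a second classifier only on the complement. A toy numpy illustration with hand-built latents and linear probes standing in for the paper's VAE classifier (everything here is a synthetic stand-in):

```python
import numpy as np

# Latent code: dimension 0 carries a spurious "shortcut" that is near-perfect
# at train time; dimension 1 carries the noisier but causal signal. Training
# only on the complementary subspace removes access to the shortcut.
rng = np.random.default_rng(3)
n, d = 400, 8
y = rng.integers(0, 2, n) * 2 - 1             # labels in {-1, +1}
z = rng.normal(size=(n, d))
z[:, 0] = y + 0.1 * rng.normal(size=n)        # shortcut dimension
z[:, 1] = 0.8 * y + 1.0 * rng.normal(size=n)  # semantic dimension

def fit_linear(X, y):
    w, *_ = np.linalg.lstsq(X, y.astype(float), rcond=None)
    return w

def accuracy(X, w, y):
    return float(np.mean(np.sign(X @ w) == y))

w_full = fit_linear(z, y)             # sees the shortcut subspace
w_comp = fit_linear(z[:, 1:], y)      # complementary, shortcut-free subspace

acc_full = accuracy(z, w_full, y)
acc_comp = accuracy(z[:, 1:], w_comp, y)
```

The full-latent probe exploits the shortcut and scores near-perfectly on this training set, while the complement-only probe is forced onto the noisier semantic feature, mirroring the two-stage design described above.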
Standard diffusion models involve an image transform -- adding Gaussian noise -- and an image restoration operator that inverts this degradation. We observe that the generative behavior of diffusion models is not strongly dependent on the choice of image degradation, and in fact an entire family of generative models can be constructed by varying this choice. Even when using completely deterministic degradations (e.g., blur, masking, and more), the training and test-time update rules that underlie diffusion models can be easily generalized to create generative models. The success of these fully deterministic models calls into question the community's understanding of diffusion models, which relies on noise in either gradient Langevin dynamics or variational inference, and paves the way for generalized diffusion models that invert arbitrary processes. Our code is available at https://github.com/arpitbansal297/cold-diffusion-models
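A minimal sketch of the generalized test-time update with a fully deterministic degradation. The degradation D here fades an image toward zero, and an exact oracle inverse stands in for the trained restoration operator R, so the loop runs end to end (the specific D and R are illustrative assumptions, not the paper's models):

```python
import numpy as np

# Cold-style sampling with a deterministic degradation: repeatedly estimate
# the clean image and step from degradation level s to s-1 via
#   x_{s-1} = x_s - D(x0_hat, s) + D(x0_hat, s-1).
T = 10
alphas = 1.0 - 0.9 * np.arange(T + 1) / T   # alpha_0 = 1.0 ... alpha_T = 0.1

def D(x0, t):
    """Deterministic degradation: scale the clean image toward zero."""
    return alphas[t] * x0

def R(xt, t):
    """Stand-in restoration: exact inverse of D (a learned model in practice)."""
    return xt / alphas[t]

rng = np.random.default_rng(4)
x0 = rng.uniform(size=(8, 8))   # toy "image"
x = D(x0, T)                    # fully degraded starting point

for s in range(T, 0, -1):
    x0_hat = R(x, s)
    x = x - D(x0_hat, s) + D(x0_hat, s - 1)

reconstruction_error = float(np.max(np.abs(x - x0)))
```

With an exact restoration operator the loop recovers the clean image; with a learned, imperfect R, the same update rule is what makes deterministic degradations generative.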
Recent work on deep learning for tabular data demonstrates the strong performance of deep tabular models, often bridging the gap between gradient boosted decision trees and neural networks. Accuracy aside, a major advantage of neural models is that they learn reusable features and are easily fine-tuned in new domains. This property is often exploited in computer vision and natural language applications, where transfer learning is indispensable when task-specific training data is scarce. In this work, we demonstrate that upstream data gives tabular neural networks a decisive advantage over widely used GBDT models. We propose a realistic medical diagnosis benchmark for tabular transfer learning, and we present a how-to guide for using upstream data to boost performance with a variety of tabular neural network architectures. Finally, we propose a pseudo-feature method for cases where the upstream and downstream feature sets differ, a tabular-specific problem that is widespread in real-world applications. Our code is available at https://github.com/levinroman/tabular-transfer-learning.
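The pseudo-feature idea above can be sketched in a few lines: when the downstream table lacks a column present upstream, fit a predictor of that column from the shared features on upstream rows and fill it in downstream. Linear regression stands in here for the paper's models, and all data is synthetic:

```python
import numpy as np

# Upstream table has an extra column; downstream rows are missing it.
rng = np.random.default_rng(5)
n_up, n_down = 500, 100
shared_up = rng.normal(size=(n_up, 4))
extra_up = shared_up @ np.array([0.5, -1.0, 0.3, 0.2]) + 0.05 * rng.normal(size=n_up)

# Fit a predictor of the extra column from the shared features (with intercept).
A = np.column_stack([np.ones(n_up), shared_up])
w, *_ = np.linalg.lstsq(A, extra_up, rcond=None)

# Impute the missing column downstream as a "pseudo-feature" and append it,
# giving the downstream model the same feature set the upstream model saw.
shared_down = rng.normal(size=(n_down, 4))
pseudo = np.column_stack([np.ones(n_down), shared_down]) @ w
downstream_full = np.column_stack([shared_down, pseudo])
```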
New astronomical tasks are often related to earlier tasks for which labels have already been collected. We adapt the contrastive framework BYOL to leverage those labels as a pretraining task while also enforcing augmentation invariance. For large-scale pretraining, we introduce GZ-Evo v0.1, a set of 96.5 million volunteer responses for 552k galaxy images, plus a further 1.34 million comparable unlabeled galaxies. Most of the 206 GZ-Evo answers are unknown for any given galaxy, and so our pretraining task uses a Dirichlet loss that naturally handles unknown answers. GZ-Evo pretraining, with or without hybrid learning, improves on direct training even when downstream labels are plentiful (+4% accuracy with 44k labels). Our hybrid pretraining/contrastive method further improves downstream accuracy over either pretraining or contrastive learning alone, especially in the low-label transfer regime (+6% accuracy with 750 labels).
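A sketch of why a count-based loss handles unknown answers "naturally", assuming a Dirichlet-Multinomial negative log-likelihood over volunteer vote counts (our reading of the loss above; the toy concentrations would come from the network in practice). A question nobody answered has zero total votes and contributes exactly zero loss, so unknown answers need no special masking:

```python
import math
import numpy as np

def dirichlet_multinomial_nll(votes, alpha):
    """Negative log-likelihood of vote counts under a Dirichlet-Multinomial
    (the multinomial coefficient, constant in alpha, is dropped)."""
    A, N = float(np.sum(alpha)), float(np.sum(votes))
    ll = math.lgamma(A) - math.lgamma(N + A)
    ll += sum(math.lgamma(k + a) - math.lgamma(a) for k, a in zip(votes, alpha))
    return -ll

alpha = np.array([2.0, 1.0, 1.0])                              # toy concentrations
answered = dirichlet_multinomial_nll(np.array([8.0, 1.0, 1.0]), alpha)
unanswered = dirichlet_multinomial_nll(np.array([0.0, 0.0, 0.0]), alpha)
```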
The prevalence of data scraping from social media as a means to obtain datasets has led to growing concerns about unauthorized use of data. Data poisoning attacks have been proposed as a bulwark against scraping, as they make data "unlearnable" by adding small, imperceptible perturbations. Unfortunately, existing methods require knowledge of both the target architecture and the complete dataset so that a surrogate network can be trained, the parameters of which are used to generate the attack. In this work, we introduce autoregressive (AR) poisoning, a method that can generate poisoned data without access to the broader dataset. The proposed AR perturbations are generic, can be applied across different datasets, and can poison different architectures. Compared to existing unlearnable methods, our AR poisons are more resistant to common defenses such as adversarial training and strong data augmentations. Our analysis further provides insight into what makes an effective data poison.
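A toy sketch of an autoregressive perturbation in the spirit described above: each pixel of the noise is a fixed linear combination of its causal neighbors plus fresh randomness, so the pattern needs no access to the dataset. The AR coefficients and epsilon here are illustrative, and a real attack would use carefully chosen (e.g., class-specific) filters; this only shows the mechanics:

```python
import numpy as np

def ar_perturbation(h, w, coeffs=(0.7, -0.4), eps=8.0 / 255.0, seed=0):
    """Generate an AR(2)-style 2D perturbation, scaled into an l_inf ball."""
    rng = np.random.default_rng(seed)
    delta = np.zeros((h, w))
    noise = rng.normal(size=(h, w))
    a_up, a_left = coeffs
    for i in range(h):
        for j in range(w):
            up = delta[i - 1, j] if i > 0 else 0.0
            left = delta[i, j - 1] if j > 0 else 0.0
            delta[i, j] = a_up * up + a_left * left + noise[i, j]
    # Scale into an l_inf ball of radius eps so the poison stays imperceptible.
    return eps * delta / np.max(np.abs(delta))

poison = ar_perturbation(32, 32)
poisoned_image = np.clip(np.full((32, 32), 0.5) + poison, 0.0, 1.0)
```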
Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision making, where many well-studied tasks like behavior cloning, offline RL, inverse dynamics, and waypoint conditioning correspond to different sequence maskings over a sequence of states, actions, and returns. We introduce the FlexiBiT framework, which provides a unified way to specify models which can be trained on many different sequential decision making tasks. We show that a single FlexiBiT model is simultaneously capable of carrying out many tasks with performance similar to or better than specialized models. Additionally, we show that performance can be further improved by fine-tuning our general model on specific tasks of interest.
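The unifying idea above can be sketched as masks over one interleaved token sequence: different boolean masks (True = token hidden, to be predicted) recover different tasks. The mask layouts below are our reading of the text, not the paper's exact specification:

```python
import numpy as np

# One trajectory of interleaved state/action tokens: s_0, a_0, s_1, a_1, ...
T = 4
tokens = [f"{tag}_{t}" for t in range(T) for tag in ("s", "a")]
n = len(tokens)
is_action = np.array([tok.startswith("a") for tok in tokens])

# Behavior cloning: hide every action, predict it from the states.
bc_mask = is_action.copy()

# Inverse dynamics: hide only a_0, keeping s_0 and s_1 visible.
inv_dyn_mask = np.zeros(n, dtype=bool)
inv_dyn_mask[1] = True  # index of a_0

# Random masking (BERT-style pretraining over the whole sequence).
rng = np.random.default_rng(6)
random_mask = rng.uniform(size=n) < 0.5
```

Training one model across randomly drawn mask patterns is what lets a single network serve many of these tasks at once.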